39 research outputs found

    Heterogeneous hand gesture recognition using 3D dynamic skeletal data

    Hand gestures are the most natural and intuitive non-verbal communication medium for interacting with a computer, and related research has recently attracted growing interest. Additionally, the identifiable features of the hand pose provided by current inexpensive commercial depth cameras can be exploited in various gesture-recognition-based systems, especially for Human-Computer Interaction. In this paper, we focus on 3D dynamic gesture recognition systems using hand pose information. Specifically, we use the natural structure of the hand topology (hereafter called hand skeletal data) to extract effective hand kinematic descriptors from the gesture sequence. The descriptors are then encoded in a statistical and a temporal representation using, respectively, a Fisher kernel and a multi-level temporal pyramid. A linear SVM classifier can be applied directly to the feature vector computed over the whole presegmented gesture to perform the recognition. Furthermore, for early recognition from a continuous stream, we introduce a prior gesture detection phase, achieved using a binary classifier, before the final gesture recognition. The proposed approach is evaluated on three hand gesture datasets containing respectively 10, 14 and 25 gestures with specific challenging tasks. We also conduct an experiment to assess the influence of depth-based hand pose estimation on our approach. Experimental results demonstrate the potential of the proposed solution for hand gesture recognition and for low-latency gesture recognition. Comparative results with state-of-the-art methods are reported.
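
    The multi-level temporal pyramid mentioned in this abstract can be illustrated with a minimal sketch. It assumes each frame is already described by a fixed-length descriptor, and it swaps the Fisher kernel encoding for plain mean pooling to stay short. The function names (`mean_pool`, `temporal_pyramid`) and the choice of three levels are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-frame descriptors of one temporal segment."""
    count = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / count for d in range(dim)]


def temporal_pyramid(frames: Sequence[Sequence[float]], levels: int = 3) -> List[float]:
    """Concatenate pooled descriptors over 1, 2, 4, ... temporal segments.

    Level 0 pools the whole sequence, level 1 pools each half, and so on.
    Mean pooling is a stand-in (assumption) for a Fisher vector encoding.
    """
    if not frames:
        raise ValueError("the gesture sequence must contain at least one frame")
    n = len(frames)
    feature: List[float] = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            start = p * n // parts
            # Guarantee a non-empty segment when the sequence is very short.
            end = max((p + 1) * n // parts, start + 1)
            feature.extend(mean_pool(frames[start:end]))
    return feature
```

    The resulting vector has a fixed length of (2^levels - 1) times the descriptor dimension, whatever the sequence length, which is what allows a linear classifier to be applied to whole gestures of varying duration.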

    3D Human Video Retrieval: from Pose to Motion Matching

    3D video retrieval is a challenging problem lying at the heart of many primary research areas in computer graphics and computer vision applications. In this paper, we present a new 3D human shape matching and motion retrieval framework. Our approach is formulated using the Extremal Human Curve (EHC) descriptor, extracted from the body surface, and a local motion retrieval achieved after motion segmentation. Matching is performed by an efficient method that takes advantage of a compact EHC representation in an open curve Shape Space and an elastic distance measure. Moreover, local 3D video retrieval is performed by the dynamic time warping (DTW) algorithm applied to the feature vectors. Experiments on both synthetic and real 3D human video sequences show that our approach provides accurate shape similarity in video compared to the best state-of-the-art approaches. Finally, results on motion retrieval are promising and show the potential of this approach.
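
    As a rough illustration of the DTW step named in this abstract, the sketch below aligns two sequences of per-frame feature vectors with the textbook dynamic-programming recurrence. The Euclidean frame distance is an assumption made for the example (the paper uses an elastic distance between EHC descriptors), and the function name `dtw_distance` is hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def dtw_distance(
    seq_a: Sequence[Vector],
    seq_b: Sequence[Vector],
    frame_dist: Callable[[Vector, Vector], float] = math.dist,
) -> float:
    """Cumulative cost of the best monotonic alignment between two sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b
                cost[i][j - 1],      # frame of seq_b repeated against seq_a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

    Retrieval then amounts to sorting the stored motion segments by their `dtw_distance` to the query segment; a sequence and a time-stretched copy of it get a distance of zero under this measure.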

    Extremal Human Curves: a New Human Body Shape and Pose Descriptor

    Automatic estimation of 3D shape similarity from video is a very important factor for human action analysis, but also a challenging task due to variations in body topology and the high dimensionality of the pose configuration space. We consider the problem of 3D shape similarity in 3D video sequences for different actors and motions. Most current approaches use conventional global features as a shape descriptor and define the shape similarity using the L2 distance. However, such methods are limited to a coarse representation and do not sufficiently reflect the pose similarity of human perception. In this paper, we present a novel 3D human pose descriptor called Extremal Human Curves (EHC), extracted from both the spatial and the topological dimensions of the body surface. To compare two shapes, we use an elastic metric in Shape Space between their descriptors, based on static features, and then perform temporal convolutions, thereby capturing the pose information encoded in multiple adjacent frames. We quantitatively analyze the effectiveness of our descriptors for both 3D shape similarity in video and content-based pose retrieval for static shapes, and show that each one can contribute, sometimes substantially, to more reliable human shape and pose analysis. Experimental results are promising and show the robustness and accuracy of the proposed approach by comparing the recognition performance against several state-of-the-art methods.

    Accurate 3D Action Recognition using Learning on the Grassmann Manifold

    In this paper we address the problem of modelling and analyzing human motion by focusing on 3D body skeletons. In particular, our intent is to represent skeletal motion in a geometric and efficient way, leading to an accurate action-recognition system. Here an action is represented by a dynamical system whose observability matrix is characterized as an element of a Grassmann manifold. To formulate our learning algorithm, we propose two distinct ideas: (1) in the first, we perform classification using a Truncated Wrapped Gaussian model, one for each class in its own tangent space; (2) in the second, we propose a novel learning algorithm that uses a vector representation formed by concatenating local coordinates in tangent spaces associated with different classes and training a linear SVM. We evaluate our approaches on three public 3D action datasets: MSR-action 3D, UT-kinect and UCF-kinect; these datasets represent different kinds of challenges and together help provide an exhaustive evaluation. The results show that our approaches either match or exceed state-of-the-art performance, reaching 91.21% on MSR-action 3D, 97.91% on UCF-kinect, and 88.5% on UT-kinect. Finally, we evaluate the latency of our approach, i.e. its ability to recognize an action before its termination, and demonstrate improvements relative to other published approaches.

    Motion Segment Decomposition of RGB-D Sequences for Human Behavior Understanding

    In this paper, we propose a framework for analyzing and understanding human behavior from depth videos. The proposed solution first employs shape analysis of the human pose across time to decompose the full motion into short temporal segments representing elementary motions. Then, each segment is characterized by human motion and depth appearance around hand joints to describe the change in pose of the body and the interaction with objects. Finally, the sequence of temporal segments is modeled through a Dynamic Naive Bayes classifier, which captures the dynamics of elementary motions characterizing human behavior. Experiments on four challenging datasets evaluate the potential of the proposed approach in different contexts, including gesture or activity recognition and online activity detection. Competitive results in comparison with state-of-the-art methods are reported.

    Space-time Pose Representation for 3D Human Action Recognition

    3D human action recognition is an important current challenge at the heart of many research areas related to the modeling of spatio-temporal information. In this paper, we propose representing human actions using spatio-temporal motion trajectories. In the proposed approach, each trajectory consists of one motion channel corresponding to the evolution of the 3D position of all joint coordinates across the frames of the action sequence. Action recognition is achieved through a shape trajectory representation that is learnt by a K-NN classifier, which benefits from Riemannian geometry in an open curve shape space. Experiments on the MSR Action 3D and UTKinect human action datasets show that the proposed approach obtains promising results in comparison to state-of-the-art methods.
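
    The K-NN classification of trajectories described in this abstract can be sketched as follows. The sketch assumes trajectories are equal-length lists of 3D points and uses the mean point-wise Euclidean distance as a placeholder; the paper instead relies on an elastic distance in an open curve shape space, which is not reproduced here. The names `trajectory_distance` and `knn_classify` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Trajectory = Sequence[Point]


def trajectory_distance(a: Trajectory, b: Trajectory) -> float:
    """Placeholder distance (assumption): mean Euclidean gap between matched points."""
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def knn_classify(
    query: Trajectory,
    training_set: List[Tuple[Trajectory, str]],
    k: int = 3,
    distance: Callable[[Trajectory, Trajectory], float] = trajectory_distance,
) -> str:
    """Label the query by majority vote among its k nearest training trajectories."""
    if k < 1 or not training_set:
        raise ValueError("k must be >= 1 and the training set non-empty")
    neighbours = sorted(training_set, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    top = max(votes.values())
    # Ties are broken in favour of the label whose neighbour is closest.
    for _, label in neighbours:
        if votes[label] == top:
            return label
    raise AssertionError("unreachable: neighbours is never empty")
```

    Because the distance is passed in as a parameter, a shape-space metric could be substituted for the placeholder without changing the voting logic.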

    SHREC'17 Track: 3D Hand Gesture Recognition Using a Depth and Skeletal Dataset

    Hand gesture recognition has recently become one of the most attractive fields of research in pattern recognition. The objective of this track is to evaluate the performance of recent recognition approaches using a challenging hand gesture dataset containing 14 gestures, performed by 28 participants executing the same gesture with two different numbers of fingers. Two research groups participated in this track; the accuracy of their recognition algorithms has been evaluated and compared with three other state-of-the-art approaches.

    The IMMED Project: Wearable Video Monitoring of People with Age Dementia

    In this paper, we describe a new application for multimedia indexing, using a system that monitors the instrumental activities of daily living to assess the cognitive decline caused by dementia. The system is composed of a wearable camera device designed to capture audio and video data of a patient's instrumental activities, combined with multimedia indexing techniques that allow medical specialists to analyze several-hour-long observation shots efficiently.

    CLASSIFICATION MULTI VUES DE RÉGIONS COULEUR - APPLICATION A L'ÉVALUATION 3D DES PLAIES CHRONIQUES

    Despite the advance of functional exploration based on sophisticated medical imaging techniques, the digitization of anatomical surfaces still relies on manual, imprecise and expensive clinical practice. An innovative tool for assessing chronic wounds has been developed that works from color images acquired with a handheld digital camera. It combines both types of assessment, namely color analysis and dimensional measurement of injured tissue, in a user-friendly system intended for wide adoption among care staff. Based on a ground truth established by clinicians, a sample database of wound tissue images has been constructed. These samples come from unsupervised color image segmentation after color correction, which ensures stability under changes in lighting conditions, viewpoint and camera type. They are characterized by color and texture descriptors, selected and re-sampled with data analysis techniques, before a Support Vector Machine classifier with a perceptron kernel is trained on four categories of tissue. The results of single-view classification are merged and directly mapped onto the mesh surface of the 3D wound model, captured using uncalibrated vision techniques applied to a stereoscopic image pair. The result is a significant improvement in the robustness of the classification, which is equally stable over several reconstructions. The exact tissue areas are obtained simply by back-projection of the tissue regions onto the 3D model. This geometric model is also strengthened, since the automatic delineation of the wound uses skin detection to remove extra triangles from the mesh.